ABSTRACT

When testing with the t-test, it is assumed that the sample
under investigation is from a normal population. The purpose
of this thesis is to examine the sensitivity of the t-test to
violations of this normality assumption. A computer simulation
was performed to draw sets of 10,000 samples from an infinite
Weibull population. A t-test was performed on each sample to
test the null hypothesis $H_0: \mu \le \mu_0$, where $\mu_0$ was the
true mean of the Weibull population. The number of times that
$H_0$ was rejected was recorded for all combinations of eight
levels of significance, samples ranging in size from 2 to 31,
and for values of the parameters of the Weibull distribution
$\lambda = 1, 2, 3$ and $\beta = 1, 2, 3$.
A. GENERAL

In statistical experimentation it is sometimes desired
to test the assumption that the mean, $\mu$, of a statistical
population is in some way related to a hypothesized value
$\mu_0$. To do so, a null hypothesis $H_0$ is formulated about
$\mu$ and tested against an alternative hypothesis, $H_1$, by
taking a sample from the population under investigation and
forming the t-statistic

$$t = \frac{\bar{x} - \mu_0}{S / \sqrt{n}}.$$

In this transformation $\bar{x} = \frac{1}{n} \sum_{k=1}^{n} x_k$ is the
sample average,
$S = \sqrt{\frac{1}{n-1} \sum_{k=1}^{n} (x_k - \bar{x})^2}$ is the sample
standard deviation (the square root of the unbiased sample
variance), $n$ is the sample size, and $\mu_0$ is the hypothesized
population mean.
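
The t-statistic defined above can be sketched in a few lines of Python. This is only an illustration of the formula; the function name `t_statistic` is a hypothetical choice, and the thesis itself used Fortran IV.

```python
import math


def t_statistic(sample, mu0):
    """Return t = (xbar - mu0) / (S / sqrt(n)) for a one-sample t-test."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    xbar = sum(sample) / n
    # Sample variance with the n - 1 divisor (unbiased variance estimator).
    s2 = sum((x - xbar) ** 2 for x in sample) / (n - 1)
    return (xbar - mu0) / math.sqrt(s2 / n)


print(t_statistic([1.0, 2.0, 3.0, 4.0], 2.0))
```

The sketch does not guard against a sample whose observations are all identical, in which case S is zero and t is undefined.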

The cumulative t-distribution has been tabulated for various
values of $n$ and levels of significance $\alpha$. When testing the
null hypothesis $H_0: \mu \le \mu_0$ against the alternative
hypothesis $H_1: \mu > \mu_0$, the procedure is to reject $H_0$ if the
calculated t is greater than or equal to the tabled (critical)
$t_{(n-1)(1-\alpha)}$ for n-1 degrees of freedom at the $1-\alpha$
confidence level, where $t_{(n-1)(1-\alpha)}$ is obtained from the
upper tail of the t-distribution. Similarly, if the null
hypothesis $H_0: \mu \ge \mu_0$ is tested against $H_1: \mu < \mu_0$, the
null hypothesis is rejected if $t \le -t_{(n-1)(1-\alpha)}$, where
$-t_{(n-1)(1-\alpha)}$ is obtained from the lower tail of the
t-distribution. In the case of a two tailed test, the null
hypothesis $H_0: \mu = \mu_0$ is tested against $H_1: \mu \ne \mu_0$ and
is rejected if

$$t \le -t_{(n-1)(1-\alpha/2)} \quad \text{or} \quad t \ge t_{(n-1)(1-\alpha/2)}.$$
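
The three rejection rules can be written as a single decision function. The sketch below is illustrative only: the name `reject_null` and the string labels for the alternatives are assumptions, and the critical values in the usage lines are standard tabled values for 3 degrees of freedom.

```python
def reject_null(t, t_crit, alternative):
    """Apply the rejection rule for a one-sample t-test.

    t_crit is the positive tabled value: t_{(n-1)(1-alpha)} for the
    one-tailed tests, t_{(n-1)(1-alpha/2)} for the two-tailed test.
    """
    if alternative == "greater":    # H0: mu <= mu0 against H1: mu > mu0
        return t >= t_crit
    if alternative == "less":       # H0: mu >= mu0 against H1: mu < mu0
        return t <= -t_crit
    if alternative == "two-sided":  # H0: mu = mu0 against H1: mu != mu0
        return abs(t) >= t_crit
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")


# Tabled values for 3 degrees of freedom: 2.353 (upper 5%), 3.182 (upper 2.5%).
print(reject_null(2.5, 2.353, "greater"))
print(reject_null(2.5, 3.182, "two-sided"))
```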


B. BACKGROUND AND PURPOSE 

When testing with the t-test it is assumed that the 
sample under investigation is from a normal population. In 
general, the purpose of this thesis is to examine the 
sensitivity of the t-test to certain violations of this
normality assumption. 

Some previous writers, e.g. Bartlett [1], have investigated
the theoretical distribution of the t statistic when
sampling from an infinite non-normal population. Bartlett
concludes from his study that even though his work was
incomplete, and not of much quantitative value, it does 
indicate that for moderate departures from normality the 
t-test may still be used with confidence, particularly for 
testing differences in means of equal numbers of observations. 

In a different approach, Pearson [6] describes how he 
mechanically drew samples from an infinite non-normal popu- 
lation. In the case where the means of only two samples 
were being tested for equivalence, the value of t was cal- 
culated for each sample to empirically obtain some idea of 
the frequency distribution. Using a chi square test to fit 
the observed t to a theoretical t distribution did not
appear to bring out any systematic discrepancy. Taken as a
whole the values of $\chi^2$ were higher than would be expected
if the variation from theory was solely due to chance. Also





the fits on the whole were better for larger size samples. 
However, Pearson never drew more than 21000 samples. A 
greater number of samples is needed to determine the five 
percent point, and more so the one percent point, since the
number of rejections at these levels is so few.
Specifically, this thesis examined the effect of sampling
from a non-normal population on the number of times the
null hypothesis was rejected (given that the null hypothesis
was true) when testing with the t-test. Rejecting a true
null hypothesis is commonly known as a type I error. If the
observed number of rejections obtained when sampling from a
non-normal distribution is near the expected number of
rejections that should be obtained by sampling from a normal
distribution, it will be possible to use the t-table as if
the sample had come from a normal distribution. However, if
the observed number of rejections (when sampling from a
non-normal distribution) is significantly different from the
expected number of rejections that should have been obtained
by sampling from a normal distribution, it will be necessary
to adjust the procedure for using the t-table to estimate a
critical t value.


C. WEIBULL DISTRIBUTION 


The non-normal distribution of interest is the Weibull
distribution with density function

$$f(x) = \lambda \beta x^{\beta - 1} e^{-\lambda x^{\beta}}, \quad x > 0.$$

The expected value of the random variable X is
$\lambda^{-1/\beta} \, \Gamma(1/\beta + 1)$ and the variance is
$\lambda^{-2/\beta} \{ \Gamma(2/\beta + 1) - [\Gamma(1/\beta + 1)]^2 \}$.








The Weibull distribution frequently appears in reliability
theory and life testing, where the random variable X
represents the time between failures. When the shape parameter
$\beta = 1$ the Weibull distribution reduces to the exponential
distribution, which has applications in queueing theory as
well as reliability theory and life testing.

The Weibull distribution takes on a variety of shapes,
depending on the value of the parameter $\beta$. The spread of
the distribution is determined by the value of the parameter
$\lambda$. One might therefore expect that the "t-statistic"
obtained by sampling from a Weibull distribution will somehow
depend on the values of $\beta$ and $\lambda$. If this is true, the
number of rejections, given that the null hypothesis is
true, will also depend on the parameter values. To examine
this possibility, combinations of $\lambda = 1, 2, 3$ with
$\beta = 1, 2, 3$ were used to develop 9 distributions from the
family of Weibull distributions. Tables I and II below give the
resultant means and variances for the 9 sets of parameter
values.
TABLE I (Means)

λ \ β    1             2             3
1        [illegible]   0.886         0.893
2        0.500         0.626         [illegible]
3        [illegible]   [illegible]   [illegible]

TABLE II (Variances)

λ \ β    1             2             3
1        1.000         [illegible]   [illegible]
2        0.250         [illegible]   0.0664
3        [illegible]   [illegible]   [illegible]
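
The means and variances in Tables I and II follow directly from the moment formulas given for the Weibull distribution. As a sketch (function names are hypothetical, and the parameterization $F(x) = 1 - e^{-\lambda x^{\beta}}$ is the one used in this thesis), they can be recomputed with the gamma function from the Python standard library:

```python
import math


def weibull_mean(lam, beta):
    """Mean: lam**(-1/beta) * Gamma(1/beta + 1)."""
    return lam ** (-1.0 / beta) * math.gamma(1.0 / beta + 1.0)


def weibull_variance(lam, beta):
    """Variance: lam**(-2/beta) * (Gamma(2/beta + 1) - Gamma(1/beta + 1)**2)."""
    g1 = math.gamma(1.0 / beta + 1.0)
    g2 = math.gamma(2.0 / beta + 1.0)
    return lam ** (-2.0 / beta) * (g2 - g1 ** 2)


# One row per lambda, one column per beta, as in Tables I and II.
for lam in (1, 2, 3):
    means = [round(weibull_mean(lam, b), 3) for b in (1, 2, 3)]
    variances = [round(weibull_variance(lam, b), 4) for b in (1, 2, 3)]
    print(lam, means, variances)
```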
Figure 1 shows the shape of the Weibull distribution
for the parameter values above.

[Figure 1: shape of the Weibull distribution for the nine sets of parameter values]


II. METHOD


A. GENERAL 

To examine the "robustness" of the Student t-test when 
sampling from a Weibull distribution, a computer simulation 
was performed to repeatedly draw samples of size i from an 
infinite Weibull population. A t-statistic was calculated 
for each sample and used to test the null hypothesis that 
the sample was drawn from a population with a mean equal to 
or less than $\mu_0$. To ensure that the hypothesis was in fact
true, the hypothesized mean $\mu_0$ was set equal to the mean $\mu$
of the Weibull population. Each time that the null hypothesis
was rejected, the rejection was recorded for the level of
significance, $\alpha_j$, at which the test was conducted, and the
size of the sample. The total number of observed rejections,
$r_{ij}$, was computed at each of eight different levels of
significance $\alpha_j$, $j = 1, \ldots, 8$, in the t-test, and for
samples of size $i = 2, \ldots, 31$.

For comparison, and to assist in validating the computer
program, the entire experiment was repeated with sampling
from a standard normal distribution. As with the Weibull
case, computer simulation was used to repeatedly draw
samples of size i. For each sample, a t-statistic was
calculated and used to test the null hypothesis that the
sample was drawn from a population with a mean $\mu$ equal to
or less than $\mu_0$. Once again the hypothesized mean $\mu_0$ was
set equal to the population mean $\mu = 0$. The number of
rejections was recorded as in the Weibull case.


B. SAMPLING TECHNIQUE

Using the random number generator RANDU provided for
Fortran IV with the IBM System/360 Source Library, uniform
random variates were generated on the interval (0,1).
Bramhall [2], in a report that discusses a comparison
of three uniform random number generators for the IBM 360,
fitted RANDU to a uniform (0,1) distribution with a Chi
Square test at the 95% confidence level.

The uniform random variates obtained from RANDU were
subsequently used to produce Weibull random variates by
the inverse transformation method [4]. Since the cumulative
frequency distribution F(x) ranges over the interval (0,1), 
the uniform (0,1) random variates, V, that were generated 
from RANDU were set equal to F(x). Solving the resultant 
equation for x produces random variates with the distribu- 
tion function desired. 


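In the case of the Weibull distribution,
$F(x) = 1 - e^{-\lambda x^{\beta}}$. Setting $F(x) = V$ and solving
for x yields

$$e^{-\lambda x^{\beta}} = 1 - V$$

$$\ln e^{-\lambda x^{\beta}} = \ln (1 - V).$$

Since the distribution of V is symmetric about 1/2, $1 - V$ has
the same distribution as V and may be replaced by it, giving

$$x^{\beta} = \frac{-\ln V}{\lambda}$$

$$x = \left( \frac{-\ln V}{\lambda} \right)^{1/\beta}$$

where V represents uniform random variates on the interval
(0,1), and $\lambda$ and $\beta$ are again the parameters of the Weibull
distribution.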
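
The inverse transformation above can be sketched in Python as follows. Two assumptions apply: the name `weibull_variate` is hypothetical, and Python's built-in generator stands in for RANDU, so the sketch reproduces the method, not the original random number stream.

```python
import math
import random


def weibull_variate(lam, beta, rng=random):
    """Draw one Weibull variate with F(x) = 1 - exp(-lam * x**beta)."""
    # rng.random() lies in [0, 1), so 1 - rng.random() lies in (0, 1]
    # and the logarithm is always defined.
    v = 1.0 - rng.random()
    return (-math.log(v) / lam) ** (1.0 / beta)


rng = random.Random(12345)  # fixed seed so the run is repeatable
sample = [weibull_variate(2.0, 1.0, rng) for _ in range(20000)]
# With lam = 2 and beta = 1 the population mean is 0.5.
print(sum(sample) / len(sample))
```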

Normal random variates were obtained using the subroutine
GAUSS provided by the IBM System/360 Source Library.
The Central Limit Theorem states that the probability
distribution of the sum of n independent and identically
distributed random variables $X_i$, each with mean $\mu_i$ and
variance $\sigma_i^2$, approaches asymptotically a normal
distribution with mean $\mu$ and variance $\sigma^2$, where
$\mu = \sum_{i=1}^{n} \mu_i$ and $\sigma^2 = \sum_{i=1}^{n} \sigma_i^2$.
Subroutine GAUSS calls RANDU to produce n uniform random
variates $V_i$ on the interval (0,1). The expected value of the
sum is $E(\sum_{i=1}^{n} V_i) = n/2$ and the variance of the sum is
$\mathrm{Var}(\sum_{i=1}^{n} V_i) = n/12$. Making the transformation

$$Z = \frac{Y - E(Y)}{\sqrt{\mathrm{Var}(Y)}} = \frac{\sum_{i=1}^{n} V_i - n/2}{\sqrt{n/12}}$$

produces an approximately standard normal random variate. A
normal random variate X with any desired mean $\mu_x$ and
variance $\sigma_x^2$ is obtained from $X = \sigma_x Z + \mu_x$.
Subroutine GAUSS sets n at 12, which eliminates the radical
in Z and speeds up the computation.
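
The sum-of-twelve-uniforms construction described above can be sketched as follows. This is not the IBM GAUSS subroutine itself; the function name `gauss12` is hypothetical and Python's built-in generator replaces RANDU.

```python
import random


def gauss12(mean=0.0, sd=1.0, rng=random):
    """Approximate normal variate from the sum of 12 uniform (0,1) variates."""
    # With n = 12 the sum has mean 6 and variance 12/12 = 1, so no
    # square root is needed to standardize it.
    z = sum(rng.random() for _ in range(12)) - 6.0
    return sd * z + mean


rng = random.Random(2024)  # fixed seed so the run is repeatable
draws = [gauss12(rng=rng) for _ in range(20000)]
m = sum(draws) / len(draws)
v = sum((x - m) ** 2 for x in draws) / (len(draws) - 1)
print(m, v)
```

One consequence of this design is that the generated values can never fall outside six standard deviations of the mean, so the extreme tails of the normal distribution are not represented.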
C. TESTING PROCEDURE

For each sample, the sample average and standard deviation
were calculated. A t-test was then conducted on each
sample to test the null hypothesis $H_0: \mu \le \mu_0$, where $\mu_0$ was
the true mean of the population from which the sample was
drawn. The calculated value of t was then compared with
the critical (tabled) t for the appropriate sample size
$i = 2, \ldots, 31$, and significance level $\alpha_j$, $j = 1, \ldots, 8$. If
the calculated t was equal to or greater than the critical
t the null hypothesis was rejected. The number of rejections
$r_{ij}$ was recorded for each level of significance $\alpha_j$
at which the null hypothesis was tested, and for samples of
size $i = 2, \ldots, 31$.
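
The counting procedure can be sketched for a single cell of the experiment: samples of size four from the Weibull distribution with both parameters equal to 1 (population mean 1.0), tested at the .05 level. The function name `count_rejections`, the seed, and the use of Python's generator in place of RANDU are assumptions; 2.353 is the standard tabled t for 3 degrees of freedom and upper-tail area .05.

```python
import math
import random

T_CRIT_DF3_05 = 2.353  # tabled t: 3 degrees of freedom, upper-tail area 0.05


def count_rejections(n_samples, size, lam, beta, mu0, t_crit, rng):
    """Count how often H0: mu <= mu0 is rejected over n_samples samples."""
    rejections = 0
    for _ in range(n_samples):
        # Weibull variates by the inverse transformation method.
        xs = [(-math.log(1.0 - rng.random()) / lam) ** (1.0 / beta)
              for _ in range(size)]
        xbar = sum(xs) / size
        s2 = sum((x - xbar) ** 2 for x in xs) / (size - 1)
        if s2 == 0.0:
            continue  # degenerate sample; t is undefined
        t = (xbar - mu0) / math.sqrt(s2 / size)
        if t >= t_crit:
            rejections += 1
    return rejections


rng = random.Random(1972)  # fixed seed so the run is repeatable
r = count_rejections(10000, 4, 1.0, 1.0, 1.0, T_CRIT_DF3_05, rng)
print(r, r / 10000)
```

Repeating the call over all sample sizes, significance levels and parameter sets would give the full array of rejection counts.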


III. METHOD OF ANALYSIS


Once the number of rejections $r_{ij}$ was determined (given
that the null hypothesis was true) for each level of
significance $\alpha_j$, $j = 1, \ldots, 8$, and for all sample sizes
$i = 2, \ldots, 31$, it was necessary to determine if the number of
observed rejections was significantly different from the
expected number of rejections $e_j$. The expected number of
rejections, assuming a normal population, was obtained by
taking the product of the probability of a type I error, $\alpha_j$
(i.e., the probability of rejecting the null hypothesis when
in fact the null hypothesis is true), and the number of
times the test was repeated with a different sample. For
each size sample, the null hypothesis $\mu \le \mu_0$ was tested for
10,000 different samples at the $\alpha_j$ level of significance.
The values of $\alpha_j$ that were used, and the resultant expected
number of rejections $e_j$, are shown below in Table III.


TABLE III

j                           1   2   3   4   5   6   7   8
$\alpha_j$                  [values illegible]
$e_j = (10,000)\alpha_j$    [values illegible]

To determine if the observed number of rejections was
significantly different from the expected number of
rejections, a Chi Square test with one degree of freedom was
conducted to evaluate the null hypothesis $H_0^*: r_{ij} = e_j$
for each value of $r_{ij}$. Table IV is the contingency table
for the Chi Square test.

TABLE IV

                   Number of times    Number of times
                   $H_0$ accepted     $H_0$ rejected     Total
Expected number    $10000 - e_j$      $e_j$              10000
Observed number    $10000 - r_{ij}$   $r_{ij}$           10000

The Chi Square statistic was obtained by calculating

$$\chi^2 = \frac{(e_j - r_{ij})^2}{e_j} + \frac{(e_j - r_{ij})^2}{10000 - e_j}.$$

When the calculated $\chi^2$ was greater than the critical $\chi^2$
with one degree of freedom at the $1 - \alpha$ confidence level,
the null hypothesis $H_0^*: r_{ij} = e_j$ was rejected. An example
of the method of analysis is given in section VII.
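
The Chi Square computation for the two-cell contingency table can be sketched as below. The function name is hypothetical; the critical values 3.841 and 6.635 are the standard tabled values for one degree of freedom at the .05 and .01 levels, and the counts 100 and 500 are those of the example in section VII.

```python
def chi_square_rejections(observed, expected, total=10000):
    """Chi Square statistic for observed vs. expected rejection counts."""
    dev2 = (expected - observed) ** 2
    # The "accepted" cell deviates by the same amount as the "rejected" cell.
    return dev2 / expected + dev2 / (total - expected)


# Tabled critical values of Chi Square with one degree of freedom.
CHI2_CRIT_1DF = {0.05: 3.841, 0.01: 6.635}

chi2 = chi_square_rejections(observed=100, expected=500)
for level, crit in CHI2_CRIT_1DF.items():
    print(level, chi2 > crit)
```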
IV. OUTPUT

The output obtained from the computer simulation was $r_{ij}$,
the observed number of times that the null hypothesis
$H_0: \mu \le \mu_0$ was rejected when in fact the null hypothesis
was true. As previously indicated, $r_{ij}$ was obtained for
$j = 1, \ldots, 8$ levels of significance in the t-test, and for
samples ranging in size from $i = 2, \ldots, 31$. For each $r_{ij}$,
the probability of a type I error, $\gamma_{ij}$, was obtained by
forming the ratio of the number of observed rejections $r_{ij}$ to
the number of samples M = 10,000 tested, i.e.,
$\gamma_{ij} = r_{ij} / 10000$.

Appendices A-J table the values of $\gamma_{ij}$ for each of the
ten cases examined (sampling from a standard normal
distribution, and sampling from a Weibull distribution with nine
sets of parameter values). The values of the index
$i = 2, \ldots, 31$ again represent the sample sizes used, and
$j = 1, \ldots, 8$ reference the levels of significance for which
each sample was tested (see Table III). Probabilities that
appear with an asterisk, i.e. $\gamma_{ij}^*$, represent a situation
where the number of observed rejections $r_{ij}$ was significantly
different from the expected number of rejections $e_j$ at
the .01 level of significance. Probabilities appearing with
parentheses, i.e. $(\gamma_{ij})$, represent the same situation at
the .05 level of significance.





V. RESULTS


A. NORMAL CASE

In review, for the normal case samples were drawn from
a standard normal distribution. A one tailed t-test was
conducted on each of 10,000 different samples at eight
levels of significance to test the null hypothesis
$H_0: \mu \le \mu_0$. The process was repeated for samples ranging
in size from 2 to 31.

As anticipated, the observed number of rejections $r_{ij}$
did nearly equal the expected number of rejections $e_j$ in
all cases. In fact the null hypothesis $H_0^*: r_{ij} = e_j$ was
accepted for all $r_{ij}$ at the .01 level of significance and
rejected for only eight of the 240 $r_{ij}$ at the .05 level of
significance. The rejections that did occur were attributed
to the stochastic nature of the testing procedure.


Appendix A tables the results for the normal case. 


B. WEIBULL CASE

The results of the simulation changed significantly
when samples were drawn from a Weibull distribution.

In general, for a given value of the parameter $\beta$, and
for fixed i and j, the observed number of rejections $r_{ij}$
was relatively insensitive to changes in the parameter $\lambda$.
However, for a given $\lambda$, as $\beta$ increased, the $r_{ij}$
increased, causing a decrease in the number of times that
$H_0^*: r_{ij} = e_j$ was rejected. In the three cases where
$\beta = 1$, with the exception of testing $H_0: \mu \le \mu_0$ at the
.0005 level, $H_0^*: r_{ij} = e_j$ was rejected for all $r_{ij}$ at
the .01 level. When testing $H_0: \mu \le \mu_0$ at the .0005 level,
$H_0^*: r_{ij} = e_j$ was rejected for most of the $r_{ij}$ at the
.05 level; otherwise $H_0^*: r_{ij} = e_j$ was accepted. The
results were essentially the same for the three cases where
$\beta = 2$, with only a slight decrease in the total number of
times $H_0^*: r_{ij} = e_j$ was rejected. In the three cases
where $\beta = 3$, the results were closest to the results
expected if sampling had been from a normal distribution. In
fact, for $\lambda = 3$ the null hypothesis $H_0^*: r_{ij} = e_j$ was
accepted for 206 of the 240 $r_{ij}$.

With few exceptions, the values of $r_{ij}$ tended to
increase as the sample size increased. This increasing
trend was difficult to detect for small values of $\alpha$,
possibly because the increase was masked by the random
fluctuations in $r_{ij}$ [illegible].

A final result of the experiment was that the $r_{ij}$ were
usually less than the expected number of rejections. In
fact, the stronger hypothesis $H_0^*: r_{ij} \le e_j$ was accepted
at the .95 level of confidence in all cases for all $r_{ij}$.


VI. CONCLUSIONS


The results of the experiment suggest that the use
of the t-test is sensitive to the assumption of normality
if sampling is done from a Weibull distribution with the
parameter values chosen in this paper. Accepting the null
hypothesis $H_0^*: r_{ij} \le e_j$ for any sample size and level of
significance implies that the probability of rejecting a
true hypothesis is less when sampling from a Weibull
distribution than when sampling from a normal distribution. This
will tend to cause the experimenter to announce too few
significant results if the t-table is used as if sampling
from a normal distribution. However, since the probability
of making a false rejection, $\gamma_{ij}$, when actually testing at
the $\alpha_j$ level of significance has now been determined for the
Weibull case, the problem of too few significant results
can be overcome for any sample size by finding the critical
t value corresponding to the desired level of significance
$\gamma_{ij}$. This procedure will be demonstrated in an example in
the next section.

In order to obtain a better estimate of the probability
of rejecting a true hypothesis at the .0005 level of
significance, more samples are needed. At this level of
significance with 10,000 samples, the expected number of
rejections is only five. The amount of deviation from the
expected number of rejections was such that no rejections
frequently occurred in 10,000 samples. This implies an
estimated probability of zero of rejecting a true hypothesis
when testing at the .0005 level. However, based on the
hypothesis that the observed number of rejections is equal
to or less than the expected number of rejections, the
probability of rejecting a true hypothesis when testing
at the .0005 level is bounded between zero and .0005.

The fact that the $r_{ij}$ increased as the sample size
increased is supported by the Central Limit Theorem. For
large samples, the "pseudo t-distribution" formed by
sampling from a Weibull distribution asymptotically
approaches a normal distribution. Since this is also true
of a "real t-distribution" where sampling is from a normal
distribution, the observed number of rejections obtained by
sampling from a Weibull distribution will approach the
expected number of rejections for increasing sample sizes.


VII. EXAMPLE


To illustrate the method of analysis, and to demonstrate
a procedure for using a t-table to estimate a critical t
value when sampling from a Weibull distribution, consider
the following example.

From Appendix B the value $\gamma_{4,5} = .0100$ implies that for
samples of size four, and testing at the .05 level of
significance with a one tailed t-test, the probability
of a type I error is estimated to actually be .0100 when
sampling is from a Weibull distribution. If $\gamma_{4,5} = .0100$
then $r_{4,5} = 100$, which implies 100 observed rejections of the
null hypothesis $H_0: \mu \le 1.0$. Filling in Table IV gives the
results below.

                   Number of times    Number of times
                   $H_0$ accepted     $H_0$ rejected     Total
Expected number    9500               500                10000
Observed number    9900               100                10000

The Chi Square statistic becomes

$$\chi^2 = \frac{(500 - 100)^2}{500} + \frac{(500 - 100)^2}{9500} \approx 336.8.$$

Since 336.8 is larger than the critical Chi Square with
one degree of freedom at either the .05 or .01 level of
significance, the null hypothesis $H_0^*: r_{4,5} = 500$ is rejected.
It can then be concluded that when sampling from a Weibull
distribution with $\lambda = 1$, $\beta = 1$ and testing with a one tailed
t-test at the .05 level of significance for samples of size
four, the observed number of rejections is significantly
different from the expected number of rejections had the
sample been from a normal distribution. Consequently the
probability of a type I error, when sampling from the above
Weibull distribution, is significantly different from the
expected probability of a type I error when sampling from a
normal distribution. In fact, since the stronger hypothesis
$H_0^*: r_{4,5} \le 500$ can be accepted, the probability of a type
I error under the above conditions is less than the
probability of a type I error when sampling from a normal
distribution. As previously mentioned, this will cause the
experimenter to announce too few significant results if a
t-table is used as if sampling from a normal distribution.
The experimenter is now faced with the problem of determining
a critical value that corresponds to the probability of
a type I error for samples from a Weibull distribution.

Returning to Appendix B to test the null hypothesis
$H_0: \mu \le \mu_0$ for samples of size four at the .05 level of
significance, where it is known that sampling is from
a Weibull distribution, it is observed that $\gamma_{4,j} = .05$ falls
between the $\alpha = 0.15$ and $\alpha = 0.10$ columns. By entering a
t-table at either the $\alpha = 0.15$ or $\alpha = 0.10$ level, for
samples of size four, the experimenter will obtain an
estimate of the critical value that corresponds to testing
at the 0.05 level of significance for samples from a
Weibull distribution. Choosing $\alpha = 0.10$ will result in a
larger critical value and a more conservative test.
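
The table lookup in this example can be sketched as a small search over one row of an appendix. The function name is hypothetical, and the row below is illustrative only: the entry .010 at the .05 level is the value quoted in the example, while the entries at the .15 and .10 levels are made-up placeholders chosen to bracket .05, not values from Appendix B.

```python
def bracketing_levels(gamma_by_alpha, target):
    """Find adjacent nominal levels whose estimated type I error
    probabilities bracket the target; return None if there are none."""
    levels = sorted(gamma_by_alpha, reverse=True)
    for hi, lo in zip(levels, levels[1:]):
        if gamma_by_alpha[lo] <= target <= gamma_by_alpha[hi]:
            return hi, lo
    return None


# Hypothetical row for samples of size four (see the lead-in above).
row = {0.15: 0.062, 0.10: 0.034, 0.05: 0.010}
print(bracketing_levels(row, 0.05))
```

The smaller of the two returned levels gives the larger critical t and therefore the more conservative test. The sketch assumes the estimated probabilities decrease as the nominal level decreases.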


VIII. EXTENSIONS


The choice of the Weibull distribution was mostly
arbitrary, even though reference was made to its application in
reliability theory and life testing. The possibilities of
extending this investigation to other non-normal
distributions are numerous. In addition to the common classes of
well known distribution functions, bimodal and truncated
distributions warrant investigation. It would also be
interesting to examine the robustness of the t-test when
sampling from non-normal distributions when two means are
being compared. Also of interest would be the case where
the two samples are from different non-normal distributions.
As indicated, the possibilities for extensions are numerous
and limited only by the experimenter's time, imagination, and
needs.